AITopics | equivariant transformer

equivariant transformer

Appendix

Neural Information Processing Systems, Apr-24-2026, 11:07:14 GMT

The introduction of convolution and attention to the space of rays in 3D required additional geometric representations for which there was no space in the main paper to elaborate. We will introduce here all the necessary notations and definitions. We have accompanied this presentation with examples of specific groups to elucidate the abstract concepts needed in the definitions.

Figure 10: The visualization of Plücker coordinates: A ray $x$ can be denoted as $(d, m)$, where $\mathbf{x}$ is any point on the ray $x$, and $d$ is the direction of the ray $x$. $m$ is defined as $\mathbf{x} \times d$.

Given the action of the group $G$ on a homogeneous space $X$, and given $x_0$ as the origin of $X$, the stabilizer group $H$ of $x_0$ in $G$ is the group that leaves $x_0$ intact, i.e., $H = \{h \in G \mid h x_0 = x_0\}$. The group $G$ can be partitioned into the quotient space (the set of left cosets) $G/H$, and $X$ is isomorphic to $G/H$ since all group elements in the same coset transform $x_0$ to the same element in $X$; that is, for any element $g' \in gH$ we have $g' x_0 = g x_0$.

Example 1. $SE(3)$ acting on the ray space $\mathcal{R}$: Take $SE(3)$ as the acting group and the ray space $\mathcal{R}$ as its homogeneous space. We use Plücker coordinates to parameterize the ray space $\mathcal{R}$: any $x \in \mathcal{R}$ can be denoted as $(d, m)$, where $d \in S^2$ is the direction of the ray, and $m = \mathbf{x} \times d$ where $\mathbf{x}$ is any point on the ray, as shown in Figure 10. $\mathcal{R}$ is the quotient space $SE(3)/(SO(2) \times \mathbb{R})$ up to isomorphism.

Example 2. $SE(3)$ acting on the 3D Euclidean space $\mathbb{R}^3$: $\mathbb{R}^3$ is isomorphic to $SE(3)/SO(3)$. Consider another case when $SE(3)$ acts on the homogeneous space $\mathbb{R}^3$; for any $g = (R, t) \in SE(3)$ and $x \in \mathbb{R}^3$, $gx = Rx + t$. If the fixed origin is $[0,0,0]^T$, the stabilizer subgroup is $H = SO(3)$, since any rotation $g = (R, 0)$ leaves $[0,0,0]^T$ unchanged.

The last example is $SO(3)$ acting on the homogeneous space sphere $S^2$. Given the fixed origin point as $[0,0,1]^T$, the stabilizer group is $SO(2)$.
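The Plücker parameterization and the stabilizer described above can be checked numerically. The following is a minimal sketch, not code from the paper: the helper names (`plucker`, `act`, `same_ray`) are hypothetical, the origin ray is assumed to be the z-axis through the origin, and the action on a ray is derived from the point action gx = Rx + t, which gives (d, m) -> (R d, R m + t × R d).

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def matvec(R, v):
    """Apply a 3x3 matrix (tuple of rows) to a 3-vector."""
    return tuple(sum(R[i][j] * v[j] for j in range(3)) for i in range(3))

def rot_z(theta):
    """Rotation by theta radians about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))

def plucker(point, direction):
    """Plücker coordinates (d, m) of the ray through `point` along `direction`, with m = x × d."""
    norm = math.sqrt(sum(c * c for c in direction))
    d = tuple(c / norm for c in direction)
    return d, cross(point, d)

def act(R, t, ray):
    """Action of g = (R, t) in SE(3) on a ray: (d, m) -> (R d, R m + t × R d)."""
    d, m = ray
    Rd = matvec(R, d)
    Rm = matvec(R, m)
    return Rd, tuple(a + b for a, b in zip(Rm, cross(t, Rd)))

def same_ray(ray_a, ray_b, tol=1e-9):
    """Compare two (d, m) pairs component-wise."""
    return all(abs(p - q) < tol for u, v in zip(ray_a, ray_b) for p, q in zip(u, v))

# (1) m does not depend on which point of the ray is used.
ray = plucker((0.5, -1.0, 2.0), (1.0, 1.0, 0.0))
shifted = plucker((3.0, 1.5, 2.0), (1.0, 1.0, 0.0))  # same ray, point moved along d
print(same_ray(ray, shifted))

# (2) Acting on (d, m) agrees with transforming a point and the direction.
R, t = rot_z(0.7), (1.0, 2.0, 3.0)
moved_point = tuple(a + b for a, b in zip(matvec(R, (0.5, -1.0, 2.0)), t))
print(same_ray(act(R, t, ray), plucker(moved_point, matvec(R, (1.0, 1.0, 0.0)))))

# (3) Stabilizer of the assumed origin ray (the z-axis): rotations about z
#     combined with translations along z leave it fixed; a sideways shift does not.
z_axis = ((0.0, 0.0, 1.0), (0.0, 0.0, 0.0))
print(same_ray(act(rot_z(1.3), (0.0, 0.0, 4.0), z_axis), z_axis))
print(same_ray(act(rot_z(0.0), (1.0, 0.0, 0.0), z_axis), z_axis))
```

The first three checks should print True and the last one False. Check (3) is the concrete content of the stabilizer statement: a rotation about the ray (SO(2)) together with a translation along it (a copy of the real line) fixes the ray, while any other motion moves it.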

artificial intelligence, convolution, machine learning, (18 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.93)

075b2875e2b671ddd74aeec0ac9f0357-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-7-2026, 12:55:24 GMT

convolution, representation, transformer, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.93)

Equivariant Light Field Convolution and Transformer in Ray Space

Neural Information Processing Systems, Oct-8-2025, 01:19:58 GMT

Meanwhile, we make the kernel locally supported without breaking the equivariance.

artificial intelligence, convolution, machine learning, (17 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science (0.67)

Does equivariance matter at scale?

Brehmer, Johann, Behrends, Sönke, de Haan, Pim, Cohen, Taco

arXiv.org Artificial Intelligence, Oct-30-2024

Given large data sets and sufficient compute, is it beneficial to design neural architectures for the structure and symmetries of each problem? Or is it more efficient to learn them from data? We study empirically how equivariant and non-equivariant networks scale with compute and training samples. Focusing on a benchmark problem of rigid-body interactions and on general-purpose transformer architectures, we perform a series of experiments, varying the model size, training steps, and dataset size. We find evidence for three conclusions. First, equivariance improves data efficiency, but training non-equivariant models with data augmentation can close this gap given sufficient epochs. Second, scaling with compute follows a power law, with equivariant models outperforming non-equivariant ones at each tested compute budget. Finally, the optimal allocation of a compute budget onto model size and training duration differs between equivariant and non-equivariant models.
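A power-law relation between compute and loss, as reported above, appears as a straight line in log-log space, so it can be fitted with ordinary least squares on the logarithms. The sketch below assumes the pure form loss = a * C^(-b) with no offset term; this excerpt does not give the paper's exact functional form. The function name `fit_power_law` is hypothetical, and the data points are synthetic, for illustration only.

```python
import math

def fit_power_law(compute, loss):
    """Least-squares fit of loss = a * compute**(-b) in log-log space; returns (a, b)."""
    xs = [math.log(c) for c in compute]
    ys = [math.log(v) for v in loss]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

# Synthetic, noise-free data (NOT results from the paper): loss = 3 * C^(-0.1).
budgets = [1e15, 1e16, 1e17, 1e18]
losses = [3.0 * c ** -0.1 for c in budgets]

a, b = fit_power_law(budgets, losses)
print(a, b)
```

Fitting one curve per model family and comparing the fitted curves at a given budget is one way to make a statement such as "outperforming at each tested compute budget" quantitative.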

architecture, arxiv preprint arxiv, transformer, (13 more...)

2410.23179

Country:

Europe > Spain > Andalusia > Granada Province > Granada (0.04)
Europe > Netherlands (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Leisure & Entertainment (0.67)
Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Euclidean, Projective, Conformal: Choosing a Geometric Algebra for Equivariant Transformers

de Haan, Pim, Cohen, Taco, Brehmer, Johann

arXiv.org Artificial Intelligence, Nov-8-2023

The Geometric Algebra Transformer (GATr) is a versatile architecture for geometric deep learning based on projective geometric algebra. We generalize this architecture into a blueprint that allows one to construct a scalable transformer architecture given any geometric (or Clifford) algebra. We study versions of this architecture for Euclidean, projective, and conformal algebras, all of which are suited to represent 3D data, and evaluate them in theory and practice. The simplest Euclidean architecture is computationally cheap, but has a smaller symmetry group and is not as sample-efficient, while the projective model is not sufficiently expressive. Both the conformal algebra and an improved version of the projective algebra define powerful, performant architectures.

equivariant transformer, geometric algebra, projective, (3 more...)

2311.04744

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Equivariant Transformer is all you need

Tomiya, Akio, Nagai, Yuki

arXiv.org Artificial Intelligence, Oct-19-2023

Machine learning, and deep learning in particular, has been accelerating computational physics, which has been used to simulate systems on a lattice. Equivariance is essential to simulate a physical system because it imposes a strong inductive bias on the probability distribution described by a machine learning model. This reduces the risk of erroneous extrapolation that deviates from data symmetries and physical laws. However, imposing symmetry on the model sometimes results in a poor acceptance rate in self-learning Monte Carlo (SLMC). On the other hand, attention as used in Transformers like GPT realizes a large model capacity. We introduce symmetry-equivariant attention to SLMC. To evaluate our proposed architecture, we apply it to a spin-fermion model on a two-dimensional lattice. We find that it overcomes the poor acceptance rates of linear models and observe a scaling law of the acceptance rate, as in large language models with Transformers.

equivariant transformer

2310.13222

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

PolyGET: Accelerating Polymer Simulations by Accurate and Generalizable Forcefield with Equivariant Transformer

Feng, Rui, Tran, Huan, Toland, Aubrey, Chen, Binghong, Zhu, Qi, Ramprasad, Rampi, Zhang, Chao

arXiv.org Artificial Intelligence, Sep-1-2023

Polymer simulation with both accuracy and efficiency is a challenging task. Machine learning (ML) forcefields have been developed to achieve both the accuracy of ab initio methods and the efficiency of empirical force fields. However, existing ML force fields are usually limited to single-molecule settings, and their simulations are not robust enough. In this paper, we present PolyGET, a new framework for Polymer Forcefields with Generalizable Equivariant Transformers. PolyGET is designed to capture complex quantum interactions between atoms and generalize across various polymer families, using a deep learning model called Equivariant Transformers. We propose a new training paradigm that focuses exclusively on optimizing forces, which is different from existing methods that jointly optimize forces and energy. This simple force-centric objective function avoids competing objectives between energy and forces, thereby allowing for learning a unified forcefield ML model over different polymer families. We evaluated PolyGET on a large-scale dataset of 24 distinct polymer types and demonstrated state-of-the-art performance in force accuracy and robust MD simulations. Furthermore, PolyGET can simulate large polymers with high fidelity to the reference ab initio DFT method while being able to generalize to unseen polymers.
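The contrast between a force-only objective and a joint energy-and-force objective can be written down in a few lines. This is a schematic sketch and not PolyGET's training code: the function names, the mean-squared-error form, and the loss weights are all assumptions made for illustration, and the numbers are toy values.

```python
def force_mse(pred_forces, ref_forces):
    """Force-only objective: mean squared error over all atoms and Cartesian components."""
    squared = [(p - r) ** 2
               for f_pred, f_ref in zip(pred_forces, ref_forces)
               for p, r in zip(f_pred, f_ref)]
    return sum(squared) / len(squared)

def joint_loss(pred_energy, ref_energy, pred_forces, ref_forces,
               w_energy=1.0, w_force=1.0):
    """Joint objective for comparison; the weights are hypothetical hyperparameters."""
    energy_term = (pred_energy - ref_energy) ** 2
    return w_energy * energy_term + w_force * force_mse(pred_forces, ref_forces)

# Toy values for a two-atom system (illustrative, not from any dataset).
pred_f = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
ref_f = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]

force_only = force_mse(pred_f, ref_f)
joint = joint_loss(2.0, 1.0, pred_f, ref_f)
print(force_only, joint)
```

In the joint form the energy term and the force term share one scalar loss, so their relative weighting has to be tuned; the force-only form has no such trade-off, which is the property the abstract highlights.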

accelerating polymer simulation, accurate and generalizable forcefield, equivariant transformer, (1 more...)

2309.00585

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Equiformer: Equivariant Graph Attention Transformer for 3D Atomistic Graphs

Liao, Yi-Lun, Smidt, Tess

arXiv.org Artificial Intelligence, Feb-27-2023

Despite their widespread success in various domains, Transformer networks have yet to perform well across datasets in the domain of 3D atomistic graphs such as molecules even when 3D-related inductive biases like translational invariance and rotational equivariance are considered. In this paper, we demonstrate that Transformers can generalize well to 3D atomistic graphs and present Equiformer, a graph neural network leveraging the strength of Transformer architectures and incorporating SE(3)/E(3)-equivariant features based on irreducible representations (irreps). First, we propose a simple and effective architecture by only replacing original operations in Transformers with their equivariant counterparts and including tensor products. Using equivariant operations enables encoding equivariant information in channels of irreps features without complicating graph structures. With minimal modifications to Transformers, this architecture has already achieved strong empirical results. Second, we propose a novel attention mechanism called equivariant graph attention, which improves upon typical attention in Transformers through replacing dot product attention with multi-layer perceptron attention and including non-linear message passing. With these two innovations, Equiformer achieves competitive results to previous models on QM9, MD17 and OC20 datasets.
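The two inductive biases named above, translational invariance and rotational equivariance, can be illustrated with a toy attention layer over 3D points. This sketch is not Equiformer: it uses no irreps, tensor products, or MLP attention. It is a much simpler stand-in in which attention weights depend only on pairwise distances (which rotations and translations leave unchanged) and the aggregated values are relative position vectors. All names are hypothetical.

```python
import math

def matvec(R, v):
    """Apply a 3x3 matrix (tuple of rows) to a 3-vector."""
    return tuple(sum(R[i][j] * v[j] for j in range(3)) for i in range(3))

def rot_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))

def rot_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return ((1.0, 0.0, 0.0), (0.0, c, -s), (0.0, s, c))

def rotate(v):
    """A fixed test rotation: first about z, then about x."""
    return matvec(rot_x(0.4), matvec(rot_z(1.1), v))

def attention_vectors(points):
    """For each point i, softmax-attend over the other points with the invariant
    logit -|x_j - x_i|^2 and aggregate the relative vectors x_j - x_i."""
    outputs = []
    for i, xi in enumerate(points):
        rel = [tuple(a - b for a, b in zip(xj, xi))
               for j, xj in enumerate(points) if j != i]
        logits = [-sum(c * c for c in r) for r in rel]
        top = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(l - top) for l in logits]
        total = sum(weights)
        outputs.append(tuple(sum(w * r[k] for w, r in zip(weights, rel)) / total
                             for k in range(3)))
    return outputs

def max_abs_diff(us, vs):
    return max(abs(a - b) for u, v in zip(us, vs) for a, b in zip(u, v))

points = [(0.0, 0.0, 0.0), (1.0, 0.2, -0.3), (-0.5, 1.1, 0.7), (0.3, -0.8, 1.5)]

# Rotational equivariance: rotating the input rotates the output.
rotated_then_attended = attention_vectors([rotate(p) for p in points])
attended_then_rotated = [rotate(v) for v in attention_vectors(points)]
equivariance_error = max_abs_diff(rotated_then_attended, attended_then_rotated)

# Translational invariance: shifting every point leaves the output unchanged.
shift = (2.0, -1.0, 0.5)
translated = [tuple(a + b for a, b in zip(p, shift)) for p in points]
translation_error = max_abs_diff(attention_vectors(translated), attention_vectors(points))

print(equivariance_error, translation_error)
```

Both errors should be at the level of floating-point round-off. The same numerical test (transform the input, then compare against the transformed output) applies to any layer claimed to be equivariant.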

artificial intelligence, machine learning, vector, (19 more...)

2206.1199

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report (0.50)

Industry: Materials > Chemicals (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

TorchMD-NET: Equivariant Transformers for Neural Network based Molecular Potentials

Thölke, Philipp, De Fabritiis, Gianni

arXiv.org Artificial Intelligence, Feb-5-2022

The prediction of quantum mechanical properties is historically plagued by a trade-off between accuracy and speed. Machine learning potentials have previously shown great success in this domain, reaching increasingly better accuracy while maintaining computational efficiency comparable with classical force fields. In this work we propose TorchMD-NET, a novel equivariant transformer (ET) architecture, outperforming state-of-the-art on MD17, ANI-1, and many QM9 targets in both accuracy and computational efficiency. Through an extensive attention weight analysis, we gain valuable insights into the black box predictor and show differences in the learned representation of conformers versus conformations sampled from molecular dynamics or normal modes. Furthermore, we highlight the importance of datasets including off-equilibrium conformations for the evaluation of molecular potentials.

arxiv, attention weight, molecule, (16 more...)

2202.02541

Country: